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Different dynamical processes on complex networks such as epidemic spreading, information prop- 
agation or cascading phenomena are highly affected by the underlying topologies as characterized 
by, for instance, the degree-degree association. Here, we introduce the concept of copulas in order to 
artificially generate random networks with a rich a prion degree-degree association structure. The 
accuracy of the proposed formalism and corresponding algorithm is numerically confirmed. The 
derived network ensembles can be systematically deployed as proper null models, in order to unfold 
the complex interplay between the topology of real networks and the dynamics on top of them. 

PACS numbers: 89.75.Hc, 02.50.Sk, 89.20.-a 



I. INTRODUCTION 

Drawing on the pertinent literature, network studies 
have provided substantial insights into the skeletal mor- 
phology of various systems, with examples as diverse as 
the human brain, online social communities, financial 
networks or electric power grids [THS]. Going beyond 
characterizing the network topology by the essential de- 
gree distribution, extensive research has focused on the 
degree-degree association [B]. A positive degree-degree 
association represents the tendency of nodes with a simi- 
larly small or large degree to be connected to each other. 
A negative degree-degree association accordingly implies 
that the nodes tend to be connected to nodes with a 
considerably different degree. Positive association is typ- 
ically found in social networks, while a negative associ- 
ation can often be observed in biological and technical 
ones [?]■ 

Generating artificial random networks with an a priori 
association structure is a prerequisite for systematically 
investigating real networks. Such null models can even- 
tually be used to shed light on the interplay between 
dynamical phenomena on networks and the underlying 
topology. Vivid examples range from information diffu- 
sion 15 and epidemic spreading O [10] in social networks 
to cascading failures in power grids [TTJ |T5] . The reshuf- 
fling method according to [131 [H] is commonly used in 
order to impose a desired level of degree-degree asso- 
ciation on random networks, as quantified by a single 
measure. While this is a straightforward algorithm, it 
appears to be incapable to fully control the overall as- 
sociation structure. This is a substantial drawback, as 
two networks with an equal association measure can ex- 
hibit significantly different association structures, even- 
tually implying different impacts on the dynamics on top 
of them. A first step towards this direction has already 
been proposed in [15] , by drawing upon two-point corre- 
lations of empirical networks. 

Here, we propose a new method for generating ran- 
dom network ensembles with a given degree distribution 
and a given degree-degree association structure by using 
copula functions. This allows to provide more compre- 
hensive null models with a complete description of their 



degree-degree association. The paper is organized as fol- 
lows. Section [lT| introduces the construction of probabil- 
ity matrices with an imposed degree-degree association, 
based on copulas. A general formalism for the realization 
of random networks based on a given probability matrix 
is provided in Section together with a description of 
the corresponding algorithm and its numerical evalua- 
tion. Section Hvl concludes. 



II. CONSTRUCTING THE PROBABILITY 
MATRIX 

The probability matrix as introduced in |16) approx- 
imates the degree-degree association structure by a bi- 
variate distribution of discrete random variables. This 
allows to generate different realizations of networks with 
the same underlying association structure. The probabil- 
ity matrix P{h, h') is the joint distribution of the number 
of edges h connected to the end of an edge, including the 
considered edge itself. The assignment and its difference 
to the node degree k is illustrated in Fig. [T] The marginal 
distribution of P{h,h') is the distribution Ph{h), which 
is related to the distribution of the node degree Pk{k) 
by Phih) = Y!lZho Pk{k)/ (k). Note that \k\ = \h\ for a 
specific node, implying hmax = kmax- A straightforward 
way to construct the probability matrix is the application 
of a bivariate discrete random distribution ^16) . However, 
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FIG. 1: (Color online) Different assignment of the number of 
edges, (a) K edges per node, (b) H edges connected to the 
end of an edge. 
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this approach suffers from the Umited number of discrete 
and especially heavy-tailed bivariate distributions. Dur- 
ing the recent decades, the use of copulas has hereby 
proved to be powerful to overcome the same shortcom- 
ing in the continuous case [T7Hl9j . The basic idea is to 
separate the marginal distributions from the association 
structure. 

For the continuous random variables X and Y the cop- 
ula C{u,v) is defined by the bivariate cumulative distri- 
bution function (CDF) Fxy{x,y) with the marginal dis- 
tributions Fx{x) and Fy{y) 

C{u,v)=F,y{F^\u),F;\v)), (1) 

where F^^ is the inverse function and u — Fx{x) and v = 
Fy{y). The simplest version of a copula is the application 
of the structure Fxy{x, y) to the random variables W and 

C{FM,Fz{z)) = F,y{F-\FUwj},F;\FM})- (2) 

The parameters of a defined copula are related to associ- 
ation measures, e.g., Kendall-Gibbons' or Spearman's 
Ps [11] ■ Two types of copulas with different association 
structures can have the same value of the respective as- 
sociation measure. Furthermore, the probability P that 
the random variables U and V are found in the intervals 
[ui,U2) and [vi,V2), respectively, is 

P{ui <U<U2,Vi<V<V2)= C{ui,vi) 

+ C{U2,V2) - C(ui,W2) - C{u2,Vi). (3) 

This formalism is used to construct the probability 
matrix P(h,h') with the marginal distribution Ph{h), 
whereas three different procedures can be followed. In 
procedure /, Ph{h) is always defined by a left bounded 
continuous CDF F.j;{x) 

Ph{h)=F,ih)^F,{h^l), (4) 

where ^ ^mm — h hmax and Xmin . — hjjiiji 1. 

Choosing a specific copula function and combining Eqs. 
[3]and|4] gives the probability matrix P{h, h') 

P{h, h') = F^yih, h') + F.,,y{h - 1, /l' - 1) 

-F^y{h^l,h')- F,,y{h,h' -I). (5) 

In the case that h is left and right bounded with hmax < 
oo, the probability matrix becomes truncated and has to 
be normalized, i.e., P{h, h') = 1, and the marginal 

distribution is recalculated with 

Pn{h)= £ P{h,h'). (6) 

For procedure II, the marginal distribution Ph{h) is 
given, and the probability matrix is written as 

P{h,h') = C{Gn{h),Gh{h')) 

+C{Gh{h^l),Gh{h' -I)) 
-C{Gh{h-l),Gh{h')) 
-C{Gh{h),Gh{h' -I)), (7) 
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FIG. 2: (Color online) Resulting probability matrices P{h, h') 
based on the Gumbel copula with A = 2. (a) Procedure 
I with a continuous Pareto marginal distribution, Fx{h) = 
1 — {h — with 7=1. (b) Procedure II with a Zipf 

marginal distribution, P{h) = h~'' /A, where A = '}2'h' 
and 7 = 2. (c) Procedure III with a truncated Zipf marginal 
distribution, P{h) = h"' /A, where A = T,'h2Z ^^'^ 
7 = 2. In all three cases hmin = 2 and hmax = 25. The 
color bar corresponds to all panels. 

where G{h) — Ph{j)- The matrix P{h,h') can 

again be truncated at hmax as in procedure I. 

For procedure III, the distribution Ph{h) is obliged to 
be truncated at hmax, implying G{hmax) = 1- The range 
of the copula is now limited, whereas the numerical dif- 
ferences between the resulting probability matrix and 
P{h, h') derived by procedure II become smaller with in- 
creasing hmax [H]. 

An example for the three procedures is the application 
of a Gumbel copula [18" with parameter A, 

G{u, v) = exp{-{{-ln{u))^ + {-ln{v))^){l/ X)). (8) 

Figure [2] depicts the probability matrices with \ — 2 
as derived by the three procedures. Interestingly, the 
different procedures are leading to considerably different 
association structures, although the same copula function 
and similar marginal distributions are applied. 

For a real network, the parameters of both the 
marginal distribution and the copula can be estimated 
by common methods of statistical inference, such as the 
maximum likelihood method. If the measure of associa- 
tion is given, the corresponding copula parameter (e.g., A 
in Eq. |8| can be numerically computed through an itera- 
tive process. Examples for resulting probability matrices 
for different values of Kendall-Gibbons' Tb based on the 
Gaussian and Gumbel copulas are shown in Fig. [3) The 
effect of the chosen level of association on the structure 
of P{h, h') is clearly visible [Figs. |3]ja)-[3](c)], while a dif- 
ferent copula function with equal Tb leads to considerably 
different association structures [Figs. [3][c)-[3]jd)]. 

III. REALIZATION OF NETWORK 
ENSEMBLES BASED ON P{h, h') 

A. Assignment probability 

Based on a given probability matrix P{h,h') the net- 
work generation draws on the assignment probability, as 
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FIG. 3: (Color online) Comparison of probability matrices 
P{h,h') with a truncated Zipf marginal distribution P{h) = 
^ '/X^h™""' ^'-'^ different values of Kendall-Gibbons' ti,. 
(a) Gaussian copula with T6=-0.3, (b) Gaussian copula with 
Tb—O, (c) Gaussian copula with Tb=0.3 and (d) Gumbel copula 
with T6=0.3. In all cases 7 = 1.5, hmin = 2 and hmax ~ 100. 
The matrices have been constructed following procedure III. 
The color bar corresponds to all panels. 



given by the bivariate distribution. We therefore consider 
an arbitrary sequence {hi, h2 ■ ■ ■ ,hi, . . . , hn) with sample 
size n, where the realizations are randomly distributed 
according to Ph{h). Letting n — > cx), the probability to 
assign a realization hi with position i to a given realiza- 
tion /i', Pi{i\h') (see Fig. 4|, can be derived from the 
probability matrix by recal ing the conditional probabil- 
ity P{h\h') = P{h, h')/Ph{h'), and using the relation 



P(/i|/i') = E^^(*l^')l-4.W, 



(9) 



with the indicator function l^^(i) = 1 if i e Ah, and 
l^h (*) = otherwise, where Ah denotes the set of equal 
realizations h. As P{h\h') = P{h'\h)Phih)/Phih') and 
= nPh{h) one easily computes 



P{h'\h) ^nPh{h')P,{i\h'). 



(10) 



Since nPh{h') is constant, P{h'\h) is proportional to 
Piii\h'). Furthermore, J2tPii^W) = 1- Thus: 



n 

P,{i\h') ^ P{h'\h,)/J2P(h'\hj), 



(11) 



being independent of the sample size n. 
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FIG. 4: (Color online) Assignment probability of the random 
variables. Note that several variables hi may have the same 
value and may belong to the same node in the network. 



1. Random generation of n realizations of drawn 
from the probability distribution Pk{k), imposing 
the constraint that the sum ki must be even. 
Hence, each node i has a total of ki "stubs" of 
edges. 

2. Random selection of a node with at least one re- 
maining stub and degree k' . 

3. Assignment of the selected node to a node with 
degree fc, having again at least one remaining stub. 
The assignment probability Pi{i\k') for connecting 
these two nodes to one another is given by 



P,{t\k') ^ rk,P{k'\k,)/y7-k^P{k'\k,) 



(12) 



Equation (12) is derived from Eq. by substi- 



tuting the variables h with k and h' with fc', respec- 
tively, and by considering all the remaining stubs 
Tfc. of the considered node i [see Fig. gb)]. The 
two selected stubs are connected to form the edge. 

4. If there are any remaining stubs go back to Step 2. 

The algorithm can be used to generate both connected 
and disconnected networks. Furthermore, the procedure 
is independent of how the underlying probability matrix 
P{h, h') has been derived - artificially based on the copula 
approach, or empirically estimated from real networks. 
Note that the number of edges has to be significantly 
higher than the maximum degree found in the network, 
so that the approximation with the probability matrix 
holds [16]. 



Numerical evaluation 



B. Description of the algorithm 



Based on the assignment probability (Eq. 11), the al- 
gorithmic procedure for realizing the network ensembles 
comprises the following steps: 



The probability matrix of an artificial or real network 
can be estimated according to the well-known empirical 
distribution function. 



P(/i, h') = m{h, h')/2L, 



(13) 
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FIG. 5: (Color online) Comparison between the underlying 
probability matrix P{h, h') and the averaged values {P{h, h')) 
of the corresponding network ensembles, consisting of 1000 
realizations. The networks have size A'^ = 1000, maximum 
degree hmax = 100 and minimum degree hmin = 3, respec- 
tively. The applied copula is C{u,v) = uv{l — a{l — u){l — v)) 
|21| . with (a) a = —0.5 for positively associated networks and 
(b) a = 0.5 for negatively associated networks. Following 
procedure III as described in Sec. |Tl] the degree distribu- 
tion is set to P{h) — h"^ /A with the normalization factor 

wherein m{h, h') is the number of realized pairs {h, h') 
and L is the number of edges, with each edge contribut- 
ing to two symmetric pairs. In order to vahdate the 
proposed aforementioned algorithm, we compare the av- 
erage {P{h, h')) of a large number of realized networks 



with the given determined probability matrix P(h,h'). 
Figures [5|a)-[5|^b) clearly confirm the agreement between 
them. Notice that for the positive association [Fig. [5ja)] 
the differences in the range of small P{h, h') values can 
be traced back to the limited number of realized large- 
degree nodes, naturally restricting the theoretical num- 
ber of connections between them [20 . Hence, high values 
of h') for large h and h' may constrain the bivariate 
approximation, whereas the values of h!) are usually 
larger for positive association in comparison to negative 
association. 



IV. CONCLUSIONS 

To sum up, in this paper we have introduced a copula- 
based method enabling the generation of random model 
networks with an a priori desired degree-degree associ- 
ation structure. The copulas are used to construct the 
underlying probability matrices which, in turn, form the 
basis for the realization of network ensembles. Our nu- 
merical investigations have demonstrated the accuracy of 
the proposed formalism and its algorithmic implementa- 
tion. The realized networks can be deployed as proper 
null models in order to systematically investigate the im- 
pact of rich topological structures on various dynamical 
processes, as found in real networks. Thereby, gaining 
experience in applying the proposed method will give in- 
sights in the most appropriate copula functions to repre- 
sent empirical networks. 

Acknowledgments 

M.R. acknowledges "swisselectric research" and Swiss 
Federal Office of Energy (project No. V155269) for co- 
funding the present work. K.T. acknowledges partial fi- 
nancial support by the Swiss Federal Office for Civil Pro- 
tection. 



[1] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and 

D. Hwang, Phys. Rep. 424, 175 (2006). 
[2] A. Barrat, M. Barthelemy, and A. Vespignani, Dynamical 

Processes on Complex Networks (Cambridge University 

Press, Cambridge, 2008). 
[3] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes, 

Rev. Mod. Phys. 80, 1275 (2008). 
[4] D. Lazer, A. Pentland, L. Adamic, S. Aral, A.-L. 

Barabasi, D. Brewer, N. Christakis, N. Contractor, 

J. Fowler, M. Gutmann, et al., Science 323, 721 (2009). 
[5] F. Schweitzer, G. Fagiolo, D. Sornette, F. Vega-Redondo, 

A. Vespignani, and D. R. White, Science 325, 422 (2009). 
[6] M. E. J. Newman, Phys. Rev. Lett. 89, 208701 (2002). 
[7] M. E. J. Newman, SIAM Rev. 45, 167 (2003). 
[8] M. Karsai, M. Kivela, R. K. Pan, K. Kaski, J. Kertesz, 

A. L. Barabasi, and J. Saramaki (2010), URL |http:// 



I arxi v.org/abs/1006. 2125| 

[9] M. KHsak, L. K. Gallos, S. Havlin, F. Liljeros, L. Much- 
nik, H. E. Stanley, and H. A. Makse, Nature Physics 6, 
888 (2010). 

[10] M. Schlapfer and L. Buzna, Phys. Rev. E 85, 015101(R) 
(2012). 

[11] M. Schlapfer and K. Trantopoulos, Phys. Rev. E 81, 
056106 (2010). 

[12] M. Schlapfer, S. Dietz, and M. Kaegi, Stress induced 
degradation dynamics in complex networks. In: Proceed- 
ings of the first international conference on infrastructure 
systems and services (INFRA 2008), p. 5 (2008). 

[13] R. Xulvi-Brunet and I. M. Sokolov, Phys. Rev. E 70, 
066102 (2004). 

[14] J. Menche, A. Valleriani, and R. Lipowsky, Phys. Rev. E 
81, 046103 (2010). 



5 



[15] S. Wcbcr and M. Porto, Phys. Rev. E 76, 046111 (2007). 
[16] M. Raschkc, M. Schlapfcr, and R. Nibali, Phys. Rev. E 

82, 037102 (2010). 
[17] A. Sklar, Publ. Inst. Statist. Univ. Paris 8, 229 (1959). 
[18] D. D. Mari and S. Kotz, Correlation and Dependence 

(Imperial College Press, London, 2001). 
[19] R. B. Nelsen, An Introduction to Copulas (Springer, New 

York, 2006). 



[20] M. Catanzaro, M. Boguiia, and R. Pastor-Satorras, Phys. 
Rev. E 71, 027103 (2005). 

[21] E. J. Gumbcl, J. Amer. Statist. Assoc. 55, 698 (1960). 

[22] Note that in the case of heavy-tailed Fx, the resulting 
marginal distributions Ph{h) in procedures I and II are 
not strictly heavy-tailed due to the truncation. 



